HEALPix logo

UXarray for Advanced HEALPix Analysis & Visualization

In this section, you’ll learn:

  • Using the uxarray package to perform advanced analysis operators over HEALPix data such as non-conservative zonal means, etc.

Prerequisites

Concepts

Importance

Notes

UXarray

Necessary

HEALPix overview

Necessary

Time to learn: 30 minutes


import cartopy.crs as ccrs
import holoviews as hv
import intake
import uxarray as ux

Open data catalog

Tip

We assume, you have already gone over the previous section, UXarray for Basic HEALPix Statistics & Visualization. If not and if you need to learn about data catalogs in general and the data we will use throughout this notebook, we recommend to check that section first.

Let us open the online catalog from the WCRP’s Digital Earths Global Hackathon 2025 using intake and read the output of the ICON run ngc4008, which is stored in the HEALPix format:

# Final data catalog location (once hackathon website (https://digital-earths-global-hackathon.github.io/) updated)
# cat_url='https://digital-earths-global-hackathon.github.io/catalog/catalog.yaml'
# Interim data catalog location
cat_url = "https://raw.githubusercontent.com/digital-earths-global-hackathon/catalog/refs/heads/ncar/online/main.yaml"
cat = intake.open_catalog(cat_url)
model_run = cat.icon_ngc4008

We can look into a fine resolution dataset at zoom level = 10 in it as Xarray.Dataset:

ds = model_run(zoom=9, time="P1D").to_dask()
ds
/home/runner/miniconda3/envs/healpix-cookbook-dev/lib/python3.10/site-packages/intake_xarray/base.py:21: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
  'dims': dict(self._ds.dims),
<xarray.Dataset> Size: 232TB
Dimensions:                              (time: 10958, depth_half: 73,
                                          cell: 3145728, level_full: 90,
                                          crs: 1, depth_full: 72,
                                          soil_depth_water_level: 5,
                                          level_half: 91,
                                          soil_depth_energy_level: 5)
Coordinates:
  * crs                                  (crs) float32 4B nan
  * depth_full                           (depth_full) float32 288B 1.0 ... 5....
  * depth_half                           (depth_half) float32 292B 0.0 ... 5....
  * level_full                           (level_full) int32 360B 1 2 3 ... 89 90
  * level_half                           (level_half) int32 364B 1 2 3 ... 90 91
  * soil_depth_energy_level              (soil_depth_energy_level) float32 20B ...
  * soil_depth_water_level               (soil_depth_water_level) float32 20B ...
  * time                                 (time) datetime64[ns] 88kB 2020-01-0...
Dimensions without coordinates: cell
Data variables: (12/103)
    A_tracer_v_to                        (time, depth_half, cell) float32 10TB ...
    FrshFlux_IceSalt                     (time, cell) float32 138GB ...
    FrshFlux_TotalIce                    (time, cell) float32 138GB ...
    Qbot                                 (time, cell) float32 138GB ...
    Qtop                                 (time, cell) float32 138GB ...
    Wind_Speed_10m                       (time, cell) float32 138GB ...
    ...                                   ...
    vas                                  (time, cell) float32 138GB ...
    w                                    (time, depth_half, cell) float32 10TB ...
    wa_phy                               (time, level_half, cell) float32 13TB ...
    zg                                   (level_full, cell) float32 1GB ...
    zghalf                               (level_half, cell) float32 1GB ...
    zos                                  (time, cell) float32 138GB ...

Create UXarray Datasets from HEALPix

We can use from_healpix as follows to open a HEALPix grid from xarray.Dataset:

uxds = ux.UxDataset.from_healpix(ds)
uxds
<xarray.UxDataset> Size: 232TB
Dimensions:                              (time: 10958, depth_half: 73,
                                          n_face: 3145728, level_full: 90,
                                          crs: 1, depth_full: 72,
                                          soil_depth_water_level: 5,
                                          level_half: 91,
                                          soil_depth_energy_level: 5)
Coordinates:
  * crs                                  (crs) float32 4B nan
  * depth_full                           (depth_full) float32 288B 1.0 ... 5....
  * depth_half                           (depth_half) float32 292B 0.0 ... 5....
  * level_full                           (level_full) int32 360B 1 2 3 ... 89 90
  * level_half                           (level_half) int32 364B 1 2 3 ... 90 91
  * soil_depth_energy_level              (soil_depth_energy_level) float32 20B ...
  * soil_depth_water_level               (soil_depth_water_level) float32 20B ...
  * time                                 (time) datetime64[ns] 88kB 2020-01-0...
Dimensions without coordinates: n_face
Data variables: (12/103)
    A_tracer_v_to                        (time, depth_half, n_face) float32 10TB ...
    FrshFlux_IceSalt                     (time, n_face) float32 138GB ...
    FrshFlux_TotalIce                    (time, n_face) float32 138GB ...
    Qbot                                 (time, n_face) float32 138GB ...
    Qtop                                 (time, n_face) float32 138GB ...
    Wind_Speed_10m                       (time, n_face) float32 138GB ...
    ...                                   ...
    vas                                  (time, n_face) float32 138GB ...
    w                                    (time, depth_half, n_face) float32 10TB ...
    wa_phy                               (time, level_half, n_face) float32 13TB ...
    zg                                   (level_full, n_face) float32 1GB ...
    zghalf                               (level_half, n_face) float32 1GB ...
    zos                                  (time, n_face) float32 138GB ...

Data variable of interest

Then let us pick a variable from the dataset, which will give us an uxarray.UxDataArray:

uxda = uxds["tas"]
uxda
<xarray.UxDataArray 'tas' (time: 10958, n_face: 3145728)> Size: 138GB
[34470887424 values with dtype=float32]
Coordinates:
  * time     (time) datetime64[ns] 88kB 2020-01-02 2020-01-03 ... 2050-01-01
Dimensions without coordinates: n_face
Attributes:
    cell_methods:   time: mean cell: mean
    component:      atmo
    grid_mapping:   crs
    long_name:      temperature in 2m
    standard_name:  air_temperature
    units:          K
    vgrid:          height_2m

Global mean and plot

Computing the global air temperature mean (at the first timestep) and also having a quick plot of it would be a good idea to have as references to compare the upcoming analyses & visualizations to them:

%%time
print(
    "Global air temperature average on ",
    uxda.time[0].values,
    ": ",
    uxda.isel(time=0).mean().values,
    " K",
)
Global air temperature average on  2020-01-02T00:00:00.000000000 :  286.3096  K
CPU times: user 480 ms, sys: 291 ms, total: 771 ms
Wall time: 1.4 s
%%time
uxda.isel(time=0).plot(
    projection=ccrs.Robinson(),
    cmap="inferno",
    features=["borders", "coastline"],
    title="Global temperature",
    width=700,
)
CPU times: user 16.9 s, sys: 1.17 s, total: 18.1 s
Wall time: 16 s
WARNING:param.GeoOverlayPlot00477: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.

Rasterized Point Plots

When working with a higher-resolution dataset at a global scale, it’s not always practical to render each cell as a polygon. Instead, we can rasterize the center of each pixel.

# Controls the size of each pixel (smaller value leads to larger pixels)
pixel_ratio = 0.5

uxda.isel(time=0).plot.points(
    projection=ccrs.Robinson(),
    rasterize=True,
    dynamic=False,
    width=700,
    pixel_ratio=pixel_ratio,
    cmap="inferno",
    title=f"Global Air Temperature, pixel_ratio={pixel_ratio}",
)
WARNING:param.GeoRasterPlot00785: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.

If we decrease the size of each pixel (by setting the pixel ratio to a higher value), we can start to see missing values, which is due to a lower density of points near the poles, leading to some pixels not containing any of our original points.

Because of this, it’s useful to try a few pixel_ratio values and see which one works best for your given resolution.

# Controls the size of each pixel (smaller value leads to larger pixels)
pixel_ratio = 2.0

uxda.isel(time=0).plot.points(
    projection=ccrs.Robinson(),
    rasterize=True,
    dynamic=False,
    width=700,
    pixel_ratio=pixel_ratio,
    cmap="inferno",
    title=f"Global Air Temperature with bad pixel size selection, pixel_ratio={pixel_ratio}",
)
WARNING:param.GeoRasterPlot00982: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.

Cross-Sections

UXarray provides several spatial cross-section methods:

  • Constant-Latitude Slice Extract data along a fixed latitude.

  • Constant-Longitude Slice Extract data along a fixed longitude.

  • Latitudinal-Interval Slice Extract data between two latitude bounds.

  • Longitudinal-Interval Slice Extract data between two longitude bounds.

To illustrate, we’ll use a zoom level of `4.

uxda_coarse = ux.UxDataset.from_healpix(model_run(zoom=4, time="P1D").to_dask())["tas"]
/home/runner/miniconda3/envs/healpix-cookbook-dev/lib/python3.10/site-packages/intake_xarray/base.py:21: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
  'dims': dict(self._ds.dims),

Constant Latitude

boulder_lat = 40.0190

uxda_cross_lat = uxda_coarse.cross_section.constant_latitude(boulder_lat)
uxda_cross_lat.isel(time=0).plot(
    rasterize=False,
    projection=ccrs.Robinson(),
    global_extent=True,
    cmap="inferno",
    features=["coastline"],
    title=f"Global temperature cross-section at {boulder_lat} degrees latitude",
    width=700,
    colorbar=False,
)
WARNING:param.GeoOverlayPlot01102: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.

Since cross sections return a new instance of our data, we can perform operations directly on them.

print(
    f"Mean at {boulder_lat} degrees lat (Boulder, CO, USA): {uxda_cross_lat.mean().values} K"
)
Mean at 40.019 degrees lat (Boulder, CO, USA): 286.6632080078125 K

Latitude Interval

uxda_lat_interval = uxda_coarse.cross_section.constant_latitude_interval(
    [boulder_lat - 15, boulder_lat + 15]
)
uxda_lat_interval.isel(time=0).plot(
    rasterize=False,
    projection=ccrs.Robinson(),
    global_extent=True,
    cmap="inferno",
    clim=(220, 310),
    features=["coastline"],
    title=f"Global temperature cross-section at the latitude interval [{boulder_lat-5},{boulder_lat+5}] degrees",
    width=700,
    colorbar=False,
)
WARNING:param.GeoOverlayPlot01276: Due to internal constraints, when aspect and width/height is set, the bokeh backend uses those values as frame_width/frame_height instead. This ensures the aspect is respected, but means that the plot might be slightly larger than anticipated. Set the frame_width/frame_height explicitly to suppress this warning.
print(
    f"Mean at the latitude interval of [{boulder_lat-5},{boulder_lat+5}] degrees (-/+15 degrees Boulder, CO, USA): {uxda_lat_interval.mean().values} K"
)
Mean at the latitude interval of [35.019,45.019] degrees (-/+15 degrees Boulder, CO, USA): 286.2819519042969 K

Non-Conservative Zonal Mean

Calculating the zonal mean is easy by providing the latitude range between -90 and 90 degrees with a step size in degrees:

zonal_mean_tas = uxda.isel(time=0).zonal_mean(lat=(-90, 90, 5))
(
    uxda.isel(time=0).plot(
        cmap="inferno",
        # periodic_elements="split",
        height=300,
        width=600,
        colorbar=False,
        ylim=(-90, 90),
    )
    + zonal_mean_tas.plot.line(
        x="tas_zonal_mean",
        y="latitudes",
        height=300,
        width=180,
        ylabel="",
        ylim=(-90, 90),
        xlim=(220, 310),
        # xticks=[220, 250, 280, 310],
        yticks=[-90, -45, 0, 45, 90],
        grid=True,
    )
).opts(title="Temperature and its Zonal means at every 5 degrees latitude")

Remapping & Custom Grid Topology

Now, let’s take a look at an example of UXarray’s remapping functionality. UXarray currently supports two remapping methods:

  • Nearest Neighbor Assigns each target point the value of its closest source point.

  • Inverse Distance Weighted Computes each target point’s value as the weighted average of nearby source points, with weights inversely proportional to distance.

For this example, we’ll use the Nearest Neighbor approach.

Region of Interest: Chicago

Before remapping, we restrict our data to a 2.5°×2.5° bounding box centered on Chicago, Illinois.

chicago_lon = -87.6298
chicago_lat = 41.8781

degree_span = 2.5
half_span = degree_span / 2.0

Since we will be using this subset of data for remapping, let’s make the bounding box slightly larger than the extent of our destination grid.

degree_eps = 0.25
half_span_eps = half_span + degree_eps

healpix_subset = uxda.isel(time=0).subset.bounding_box(
    (chicago_lon - half_span_eps, chicago_lon + half_span_eps),
    (chicago_lat - half_span_eps, chicago_lat + half_span_eps),
)
healpix_subset.plot(
    title="Temperature Subset around Chicago (HEALPix zoom 9)",
    backend="matplotlib",
    width=500,
    height=500,
    aspect=2,
    cmap="inferno",
    features=["states", "rivers"],
)
/home/runner/miniconda3/envs/healpix-cookbook-dev/lib/python3.10/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/110m_cultural/ne_110m_admin_1_states_provinces_lakes.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
/home/runner/miniconda3/envs/healpix-cookbook-dev/lib/python3.10/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/110m_physical/ne_110m_rivers_lake_centerlines.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)

Creating our Destination Grid

UXarray wraps SciPy’s mesh generation methods to support representing point cloud data build around the following algorithms

  • Delaunay Triangulation

  • Voronoi Tesselation

Let’s create a grid of points centered around Chicago and triangulate it to create our destination grid.

import numpy as np

n_points = 10

# Longitude Vector
lon_vals = np.linspace(
    chicago_lon - half_span, chicago_lon + half_span, n_points, dtype=np.float64
)

# Latitude Vector
lat_vals = np.linspace(
    chicago_lat - half_span, chicago_lat + half_span, n_points, dtype=np.float64
)

# Meshgrid
lon, lat = np.meshgrid(lon_vals, lat_vals)

# flatten for processing
lon_flat = lon.ravel()
lat_flat = lat.ravel()

# Create a mask of boundary points
min_lon, max_lon = lon_vals[0], lon_vals[-1]
min_lat, max_lat = lat_vals[0], lat_vals[-1]

mask = (
    np.isclose(lon_flat, min_lon)
    | np.isclose(lon_flat, max_lon)
    | np.isclose(lat_flat, min_lat)
    | np.isclose(lat_flat, max_lat)
)

boundary_points = np.flatnonzero(mask)
destination_grid = ux.Grid.from_points(
    (lon_flat, lat_flat),
    method="regional_delaunay",
    boundary_points=boundary_points,
)

destination_grid.plot(
    title="Triangulated Points (Destination Grid)",
    backend="matplotlib",
    width=500,
    height=500,
    aspect=2,
    features=["states", "rivers"],
)

Remapping

Now that we have both our subset of the original data and our custom destination grid, let’s remap!

uxda_remapped = uxda.isel(time=0).remap.nearest_neighbor(
    destination_grid, coord_type="cartesian"
)
/home/runner/miniconda3/envs/healpix-cookbook-dev/lib/python3.10/site-packages/uxarray/grid/coordinates.py:254: UserWarning: This cannot be guaranteed to work correctly on concave polygons
  warnings.warn("This cannot be guaranteed to work correctly on concave polygons")
/home/runner/miniconda3/envs/healpix-cookbook-dev/lib/python3.10/site-packages/uxarray/grid/coordinates.py:254: UserWarning: This cannot be guaranteed to work correctly on concave polygons
  warnings.warn("This cannot be guaranteed to work correctly on concave polygons")
(
    healpix_subset.plot(
        title="Temperature Subset around Chicago (HEALPix zoom 9)",
        backend="matplotlib",
        width=600,
        height=600,
        aspect=2,
        cmap="inferno",
        features=["states", "rivers"],
    )
    + uxda_remapped.plot(
        title="Remapped Temperature",
        backend="matplotlib",
        width=600,
        height=600,
        aspect=2,
        cmap="inferno",
        features=["states", "rivers"],
    )
).cols(1).opts(fig_size=150)